Report on the CONALD Workshop on Learning from Text and the Web

Authors

  • Jaime Carbonell
  • Mark Craven
  • Steve Fienberg
  • Tom Mitchell
  • Yiming Yang
Abstract

An increasing fraction of the world's information and data is now represented in textual form. For example, the World Wide Web, online news feeds, and other Internet sources contain a tremendous volume of textual information. The goal of the CONALD workshop on Learning from Text and the Web was to explore computer methods for automatically extracting, clustering and classifying information from text and hypertext sources. The workshop included ten oral paper presentations, an organized discussion by a panel of distinguished researchers, and a handful of other contributed papers. The workshop provided a good survey of the state of the art in machine learning methods applied to text processing tasks. The presented work involved a wide array of learning approaches, including finite-state-machine induction [HD, MMK], neural networks that can accept advice from users [SER], relational learning methods [Moo, SC], statistical clustering algorithms [GS, Hof, LV, YPC], boosting methods [ADW], algorithms for learning with hierarchical classes [Hof, MG], and active learning methods [LT, NM]. A principal limitation of many of these approaches is that they do not directly reflect attempts to develop formal models of the text phenomenon of interest. The research presented at the workshop also spanned a broad range of application tasks, in-presentation of documents in information retrieval systems [GS, Hof], collaborative filtering [dVN], lexicon learning [GBGH], query reformulation [KK], text generation [Rad] and analysis of the statistical properties of text [MA]. In short, the state of the art in learning from text and the web is that a broad range of methods are currently being applied to many important and interesting tasks. There remain numerous open research questions, however.
Broadly, the goals of the work presented at the workshop fall into two overlapping categories: (i) making textual information available in a structured format so that it can be used for complex queries and problem solving, and (ii) assisting users in finding, organizing and managing information represented in text sources. As an example of research aimed at the former goal, Muslea, Minton and Knoblock [MMK] have developed an approach to learning wrappers for semi-structured Web sources, such as restaurant directories. Their method is able to induce extraction rules from small numbers of labeled examples. These learned extraction rules are then applied so that Web pages can be treated like structured databases. As an example of work geared toward the latter goal, Shavlik and Eliassi-Rad [SER] have developed an approach to …
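
The general idea of wrapper learning mentioned above (inducing an extraction rule from a few labeled pages, then applying it to unseen pages) can be illustrated with a very small sketch. This is not the algorithm of Muslea, Minton and Knoblock; it is a simplified delimiter-based rule learner written only for illustration. The function names (`learn_wrapper`, `extract`) and the sample restaurant pages are hypothetical, and the sketch assumes each labeled target string occurs in its page.

```python
import os
import re


def common_suffix(strings):
    """Return the longest string that every input ends with."""
    reversed_prefix = os.path.commonprefix([s[::-1] for s in strings])
    return reversed_prefix[::-1]


def learn_wrapper(examples):
    """Induce a (left, right) delimiter pair from (page, target) examples.

    The left delimiter is the longest text that precedes every target,
    the right delimiter is the longest text that follows every target.
    Only the first occurrence of each target in its page is used.
    """
    lefts, rights = [], []
    for page, target in examples:
        start = page.index(target)  # raises ValueError if target is absent
        lefts.append(page[:start])
        rights.append(page[start + len(target):])
    left = common_suffix(lefts)
    right = os.path.commonprefix(rights)
    if not left or not right:
        raise ValueError("no common delimiters found")
    return left, right


def extract(page, wrapper):
    """Apply a learned wrapper and return every delimited field value."""
    left, right = wrapper
    pattern = re.escape(left) + r"(.*?)" + re.escape(right)
    return re.findall(pattern, page, flags=re.DOTALL)


# Two hypothetical labeled pages from a restaurant directory.
training = [
    ("<h1>Downtown</h1><p>Name: <b>Luigi's</b> Phone: 555-1234</p>", "Luigi's"),
    ("<h1>Harbor</h1><p>Name: <b>Taco Hut</b> Phone: 212-9876</p>", "Taco Hut"),
]
wrapper = learn_wrapper(training)

# An unseen page with the same layout.
new_page = "<h1>Midtown</h1><p>Name: <b>Blue Moon</b> Phone: 646-0000</p>"
names = extract(new_page, wrapper)
```

With a rule of this kind, each page in the directory yields a field value that could be stored as a row in a table, which is the sense in which pages are treated like a structured database. A rule this simple overfits easily when the training pages share incidental text, which is one reason practical wrapper learners use richer rule languages.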

Related resources

Automated Learning and Discovery State-of-the-Art and Research Topics in a Rapidly Growing Field

This report summarizes the CONALD meeting, which took place June 11-13, 1998, at Carnegie Mellon University. CONALD brought together an interdisciplinary group of scientists, concerned with decision making based on data. This report is organized in two parts. The first part (pages 1-6) summarizes the CONALD meeting and highlights its main outcomes, beyond the individual workshop level. The seco...

Efficient Method Based on Combination of Deep Learning Models for Sentiment Analysis of Text

People's opinions about a specific concept are considered as one of the most important textual data that are available on the web. However, finding and monitoring web pages containing these comments and extracting valuable information from them is very difficult. In this regard, developing automatic sentiment analysis systems that can extract opinions and express their intellectual process has ...

Effect of Participation in the “Principles of the Morning Report Case Presentation” Workshop on Clinical Faculty Members' Performance

Background: Morning reports are one of the most popular clinical education methods in hospital settings. The first step in improving the quality of this educational method is to understand the current situation. The aim of this study was to examine the effect of an educational workshop on the quality of morning reports at Golestan University of Medical Sciences. Methods: In this interventional study, using census sampling 14...

Comparing the Efficiency of Electronic Learning and Workshop Learning on Knowledge and Performance of Nursing Students in Controlling Nosocomial Infections

Background: Being familiar with new teaching methods and comparing their results helps teachers plan better for applying such methods in the future. This study aimed to compare the efficiency of electronic learning and workshop learning on the knowledge and performance of nursing students in controlling nosocomial infections. Methods: Two groups were selected by pre-test post-test method. Stud...

The effects of educational workshops holds by EDC of Tehran University of Medical Sciences on the participant faculty

Lifelong learning is increasingly acknowledged to be a characteristic of professionalism. New information is being generated with increasing rapidity and educators must be able to cope with it. The aim was to study the effects of educational workshops held by EDC of TUMS on the participant faculty. Methods: The subjects of this cross-sectional descriptive study were 375 faculty members of TUMS and ...

Publication date: 1998